Sheboygan
Chain-of-Thought Reasoning In The Wild Is Not Always Faithful
Arcuschin, Iván, Janiak, Jett, Krzyzanowski, Robert, Rajamanoharan, Senthooran, Nanda, Neel, Conmy, Arthur
Chain-of-Thought (CoT) reasoning has significantly advanced state-of-the-art AI capabilities. However, recent studies have shown that CoT reasoning is not always faithful, i.e. CoT reasoning does not always reflect how models arrive at conclusions. So far, most of these studies have focused on unfaithfulness in unnatural contexts where an explicit bias has been introduced. In contrast, we show that unfaithful CoT can occur on realistic prompts with no artificial bias. Our results reveal non-negligible rates of several forms of unfaithful reasoning in frontier models: Sonnet 3.7 (16.3%), DeepSeek R1 (5.3%) and ChatGPT-4o (7.0%) all answer a notable proportion of question pairs unfaithfully. Specifically, we find that models rationalize their implicit biases in answers to binary questions ("implicit post-hoc rationalization"). For example, when separately presented with the questions "Is X bigger than Y?" and "Is Y bigger than X?", models sometimes produce superficially coherent arguments to justify answering Yes to both questions or No to both questions, despite such responses being logically contradictory. We also investigate restoration errors (Dziri et al., 2023), where models make and then silently correct errors in their reasoning, and unfaithful shortcuts, where models use clearly illogical reasoning to simplify solving problems in Putnam questions (a hard benchmark). Our findings raise challenges for AI safety work that relies on monitoring CoT to detect undesired behavior.
- North America > United States > Nevada > Carson City (0.14)
- North America > United States > Wisconsin > Sheboygan County > Sheboygan (0.14)
- Asia > Middle East > Iraq (0.04)
- (28 more...)
- Leisure & Entertainment (0.68)
- Media > Film (0.46)
- Education (0.46)
This artist uses AI to show how the world's streets could be more pedestrian-friendly
Artist and musician Zach Katz virtually transforms the streets of the world's major cities in order to show how the space could be made more hospitable to pedestrians. Every day, he posts new creations on his Twitter account and, in the space of a few weeks, has become something of a star among Internet users, city planners and politicians. How do we imagine the downtown zones of the future? The Twitter account @Betterstreetsai explores this question by using DALL-E, the artificial intelligence (AI) that has started a real trend on the Web. Here, fountains, green space, rails or even roads reserved for bikes and cyclists take over the space to dramatically change the urban landscape.
- Oceania > Australia (0.07)
- North America > United States > Wisconsin > Sheboygan County > Sheboygan (0.07)
- North America > United States > New York (0.07)
- North America > United States > California > Los Angeles County > Los Angeles (0.07)